A Portable, Cross-platform Emulation System

Author

  • John Casey
Abstract

Traditionally, emulation systems have unnecessarily been written and compiled for specific computer architectures. As a result, emulations cannot easily be ported to other computing platforms. This paper proposes a portable emulation system in which dependencies on particular host architectures are minimised by developing the emulation runtime in the portable Java environment. With this system architecture, the emulation can be executed on any computer platform that has a Java runtime component. Using this framework, a pilot system has been developed that emulates a subset of the 8086 processor using interpretation. The relative performance of the emulation is then evaluated by comparing the benchmark results attained by the portable emulation with those of the natively compiled 80x86 Bochs emulation, which uses a similar interpretive mechanism to the pilot system. Additionally, the system is compared to the 8086, 80486 and Pentium 3 processors. The benchmark programs are processor bound test sequences of 8086 instructions, and comprise specific groupings of similarly formatted instructions. The performance of the pilot system is better than the 8086 but not as good as the 80486, Pentium 3 or Bochs emulator. The performance differences between the pilot system and Bochs highlight the trade-offs that exist between portability and raw performance.

INTRODUCTION

Computer emulation is a software migration technique that allows legacy software systems to execute as they were designed on newer hardware and software platforms. Emulation software provides translation mechanisms, using various techniques, to reproduce the logic and design of the system(s) the software was originally developed for. Emulation development is therefore an extremely complex and time consuming task. Currently, a number of emulation systems have begun to use processor definition files and driver based emulation frameworks.
These emulation architectures allow the core emulation component to be re-used in the development of multiple emulation systems for various host environments. In this paper, a simpler architecture is proposed, where the entire emulation system is written specifically for a virtualised, portable environment such as Sun’s Java. Using this approach, the complexities involved in the development of a portable emulation can be reduced to the equivalent of developing a standalone emulation. Such a system can be used on any computer platform where a Java Virtual Machine is available. It is expected that such a portable emulation, when running on modern hardware, will be able to decode and execute instructions many times faster than the original 8086. However, the performance differences between the portable emulation and the Bochs emulation are expected to highlight the trade-off between performance and portability that exists in such portable environments.

This paper is organised as follows. Section 2 examines the related work. Section 3 deals with the architecture and development of the portable emulation proposed in this paper. In Section 4 the performance of this emulation system is evaluated and compared with contemporary emulation and hardware systems. Finally, Section 5 concludes the paper and discusses further work.

Casey. Portable Emulation. Proceedings of the First Australian Undergraduate Students’ Computing Conference, 2003, page 19

RELATED WORK

This section is divided into two parts. The first part discusses current emulation processes and technology, whilst the second part examines portability issues in emulation. The first part gives an overview of the various emulation processes, as well as an indication of their relative performance and compatibility. The second part focuses solely on issues of portability and reusability, and their use in emulation.
Emulation Processes

Interpretation is the baseline emulation technique, and mirrors the fetch, decode, and execute loop that processors use to execute programs. In emulations, this process typically emulates one instruction at a time. Unfortunately, interpretive emulation mechanisms carry a high overhead that can slow emulation speed. On the other hand, the interpretive algorithm is the simplest emulation mechanism, and is able to faithfully replicate exotic programming constructs such as polymorphic code. Interpretation is used in several emulation systems, such as the PC emulation Bochs [Law99] and Apple’s 68000 Macintosh emulation [App94].

In comparison, static binary translation systems recompile the machine code of a program to that of a target system. Once an executable program has been translated, it can be run as a native program on the target computer. Translated executables can run at near full speed without any interaction with the translation program. However, binary translation cannot correctly translate polymorphic code. Consequently, many translators work in conjunction with interpreters and use the interpretive method as a fall-back approach. A number of popular emulation systems use this technique, such as Digital’s Freeport Express [Dig95] and FX!32 [HH97] systems, among others.

Dynamic binary translation is an extension to static binary translation, and uses a just-in-time approach to generate a cache of native machine code instructions. These translated instructions can be called directly by a host system’s processor, without having to repeatedly re-interpret emulated instructions. The dynamic nature of these translation processes allows emulation software to identify all sections of program code.
Presently, dynamic translation is becoming the dominant emulation mechanism in many emulation environments, such as Apple’s re-written 68k Macintosh emulator [App95], ARDI’s Executor [Hos95], IBM’s DAISY environment [EA97] and, notably, virtual machine environments such as Sun’s Java [Sun01].

Portable Emulations

Emulations are complicated systems to design, develop and implement. As such, an in-depth and intimate knowledge of the hardware and software systems used in both the source and target computer systems is required. The engineering process of emulations typically locks the system into a single source and target system. In recent years this problem has been overcome somewhat by re-writing emulation systems in portable languages such as ANSI C and C++. This allows an emulation system to be rebuilt for any target platform where an appropriate compiler is available. Unfortunately, while a target system can be manually adjusted and retargeted to suit multiple platforms, the source or emulated system is hard coded as program statements and instructions. Additionally, such emulations may not be as hardware independent as they could be, due to limitations of the language and hardware specific optimisations.

Consequently, this has led to the development of emulation systems which rely on processor and system specification files, which define the format, syntax and behaviour of hardware and software functions. [CS97] have developed SSL, an instruction specification encoding language. In their research, they define the syntax and semantics of the instructions in a procedural manner, which specifies the interactions of an instruction with the registers and address space of a processor. A similar process is used in Syn68k [Hos95], a portable dynamic binary translation system that emulates the 680x0 family of processors and is used in Executor, the Macintosh emulator.
The core of the Syn68k emulation is based around an instruction specification system called Syngen, which is used at compile time to generate the system’s emulation and translation functions. At compile time, Syngen optimises the code generated from the specification for different host environments, based on the architectural details of particular processors and software systems. However, the focus of Syngen is optimisation rather than portability.

Using a similar automated process to Syngen, the University of Queensland’s dynamic binary translator (UQDBT) [UC00] provides a core emulation framework that is supplemented with code generated from semantic specification files. In contrast to the performance focus of Syn68k, the UQDBT system centres on retargeting the core emulation, so it can support various source and target computer systems. However, UQDBT cannot be easily retargeted to new platforms. The system building process requires new system definition files to be created, and the entire system to be recompiled and retested. Therefore, the source code of the UQDBT emulation is portable between different platforms, whilst the compiled executables are not.

Other emulation toolkits, notably the open source MAME (Multiple Arcade Machine Emulator) [Mam96] and MESS (Multiple Emulation Super System) [Mes98] projects, use a modular, driver based architecture. Using this structured approach, the functionality of various processor and I/O device emulation modules can be re-used through simple driver programs. These driver programs glue processor and I/O device modules together to emulate an entire computer system. The architectures of these systems are similar to retargetable systems in that the basic core system can potentially support numerous emulations.
However, MAME and MESS differ in that they support emulations using pre-compiled code libraries, rather than the specification file and code generation techniques of Syn68k and UQDBT.

The retargetable and driver based emulation frameworks examined here provide a fast method for developing processor efficient emulation systems. Unfortunately, these frameworks depend on pre-existing system definition files and processor emulation libraries. Consequently, developing emulations with these architectures can be quite complex. To appreciate the complexity of the development process, consider the following examples.

In retargetable emulation frameworks, the most common architectural elements of a processor must be supported, as well as the more exotic features. Therefore, the core functionality of these emulations can be complicated by functions that may not be necessary for the emulation of common processor designs. Conversely, the core emulation may be missing emulation functions that particular computer architectures need, which means more functionality will have to be developed and added to the system.

As with retargeting emulations, the development of driver based systems can also be complicated. These systems require a library of pre-written processor and I/O device emulations, which take a long time to develop if they do not already exist. Moreover, to support the emulation of new computer architectures, extra processor and I/O device emulations have to be developed and added to the library system.

The research documented in the remainder of this paper examines the performance characteristics of such a portable emulation system and compares it with the performance of a natively compiled emulation and several comparable hardware systems.

DEVELOPMENT PROCESSES

This section outlines the architecture, steps and processes involved in the development of a portable emulation system.
Initially, the scope, goals and requirements of the system architecture will be examined. This will be followed by an examination of the different components that comprise the system architecture.

For the purposes of this study, the emulation development focuses on a subset of the entire emulation process. Consequently, the emulation system concentrates on the core components of an emulation system: the processor module, memory system and register set. As such, the emulation does not attempt to replicate a computer’s input or output systems. The Intel 8086 architecture has been selected as the target of this emulation, as contemporary systems are still able to execute 8086 programs.

[Figure 1: layered system architecture. Labels: CPU, Memory, Register, Emulation, Java Virtual Machine, Operating System, Hardware System.]

As this system is a pilot study, interpretation has been selected as the primary method of program execution. As stated in the literature, the interpretive algorithm is very inefficient. However, the simplicity and accuracy of interpretation compensate for the performance problems associated with the method. Moreover, it is unlikely that a dynamic binary translation system could be implemented easily within the Java environment, as there is no simple way to generate useable byte-code at runtime.

Finally, the instruction set of the emulation has been limited, so that only common instructions and addressing modes are emulated. Consequently, the processor emulation does not represent a full emulation, but rather a test-bed system suitable for testing purposes only. The primary purpose of the emulation is therefore to generate program execution times, which can be used to compare the performance of portable emulation to that of the corresponding hardware systems.
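To make the interplay of processor module, memory system and register set concrete, the following is a minimal sketch of an interpretive core in Java. It is not the paper's code: the class name `PilotCpuSketch`, its fields and its methods are hypothetical, only six 8086 opcodes are handled, flag updates and segmentation are omitted, and the byte views of AX are inlined as methods rather than placed in a separate register class.

```java
// Hypothetical sketch: names are illustrative and not taken from the paper.
final class PilotCpuSketch {
    static final int MEMORY_SIZE = 1 << 20;     // the 8086's 1 MB address space
    final int[] memory = new int[MEMORY_SIZE];  // one array element per byte
    int ax;                                     // 16-bit accumulator held in an int
    int ip;                                     // instruction pointer (segmentation omitted)
    boolean halted;

    // Read a 16-bit little-endian word: low byte first, then high byte.
    int readWord(int address) {
        return (memory[address] & 0xFF) | ((memory[address + 1] & 0xFF) << 8);
    }

    // Byte views of AX, built with masks because Java has no union type.
    int getAl() { return ax & 0xFF; }
    int getAh() { return (ax >> 8) & 0xFF; }
    void setAl(int value) { ax = (ax & 0xFF00) | (value & 0xFF); }
    void setAh(int value) { ax = (ax & 0x00FF) | ((value & 0xFF) << 8); }

    // Fetch, decode and execute a single instruction.
    void step() {
        int opcode = memory[ip++] & 0xFF;
        switch (opcode) {
            case 0x90: // NOP
                break;
            case 0xB0: // MOV AL, imm8
                setAl(memory[ip++]);
                break;
            case 0xB4: // MOV AH, imm8
                setAh(memory[ip++]);
                break;
            case 0xB8: // MOV AX, imm16
                ax = readWord(ip);
                ip += 2;
                break;
            case 0x40: // INC AX (flag updates omitted in this sketch)
                ax = (ax + 1) & 0xFFFF;
                break;
            case 0xF4: // HLT
                halted = true;
                break;
            default:
                throw new IllegalStateException(
                        "Unsupported opcode 0x" + Integer.toHexString(opcode));
        }
    }

    // Load a program at address 0, run until HLT and return the final AX value.
    int run(int... program) {
        System.arraycopy(program, 0, memory, 0, program.length);
        ip = 0;
        halted = false;
        while (!halted) {
            step();
        }
        return ax;
    }

    public static void main(String[] args) {
        PilotCpuSketch cpu = new PilotCpuSketch();
        // MOV AX, 0x1234 ; INC AX ; MOV AL, 0xFF ; HLT
        int result = cpu.run(0xB8, 0x34, 0x12, 0x40, 0xB0, 0xFF, 0xF4);
        System.out.println(Integer.toHexString(result)); // prints 12ff
    }
}
```

Every instruction costs at least one array read and one `switch` dispatch, which is one plausible source of the interpretive overhead mentioned above.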
Candidate instructions and addressing modes were selected for emulation on the basis of how often they would be used in general 8086 programs. Using this approach, approximately 60 percent of the 8086 instruction set was emulated.

Figure 1 illustrates the layered architecture of the emulation system. As shown, the Java virtual machine acts as an abstraction layer that allows the emulator to run on any computer platform. The CPU module is the main component of the emulator. It repeatedly interprets and executes instructions using a switch decode loop, and uses function calls to execute decoded instructions. These functions may interact with the system’s memory or register objects.

The memory component comprises an integer array which spans the 8086’s 1 MB address space. Decode/execute functions can read or write values to memory simply by indexing the integer array. Simple array indexing is sufficient for reading and writing single byte values. However, read() and write() routines have been written for addressing word values, which have to be byte-swapped for the little endian memory format of the 8086.

The system registers use integer variables to represent the word addressable registers of the 8086, and these are updated using assignment statements. However, the data registers of the 8086 are both byte and word addressable. For these registers, a separate class structure has therefore been developed, with methods that allow the register to be accessed as a word, or as a low or high byte value. Bitwise masking operations are used to perform these addressing functions, due to the lack of a union structure in Java.

PERFORMANCE ANALYSIS

To evaluate the proposed emulation architecture, a series of benchmarks has been developed to examine the instruction execution performance of the system. The benchmarks consist of simple processor bound programs that group machine code instructions with similar formatting and function.
This grouping allows the performance characteristics of different instruction groupings (such as stack, memory transfer and arithmetic) to be analysed. Performance is measured by recording the execution time of each benchmark program as it is executed on the various test systems. These test systems comprise the Intel Pentium 3/700 MHz, Intel 80486 DX2/66 MHz, Intel 8086/8 MHz, and Bochs, a native 80x86 emulator. The Bochs emulator uses a similar interpretive architecture to the portable emulation proposed in this paper. The Pentium 3/700 MHz system is used to execute the portable emulation, as well as the Bochs emulator.

[Figure: benchmark results chart. Only the data labels 2.68, 0.76, 0.08, 0.23 and 2.75E-03 and an axis running from 0.00 to 3.00 in steps of 0.50 survive.]
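The measurement approach described above, recording the execution time of each benchmark program, could be sketched in Java as follows. This is a sketch under stated assumptions, not the paper's harness: `BenchmarkTimerSketch`, `timeMillis` and `relativeTime` are hypothetical names, the workload is a stand-in loop rather than an 8086 instruction sequence, and a single timed run ignores JVM warm-up effects.

```java
// Hypothetical timing sketch: names and workload are illustrative only.
final class BenchmarkTimerSketch {

    // Time one run of a workload in milliseconds using the JVM's monotonic clock.
    static double timeMillis(Runnable workload) {
        long start = System.nanoTime();
        workload.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    // A ratio above 1 means the candidate took longer than the baseline.
    static double relativeTime(double candidateMillis, double baselineMillis) {
        return candidateMillis / baselineMillis;
    }

    public static void main(String[] args) {
        long[] sink = new long[1]; // keeps the loop result observable
        double elapsed = timeMillis(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                sink[0] += i; // stand-in for a processor bound benchmark
            }
        });
        System.out.println("elapsed ms: " + elapsed);
    }
}
```

In practice each benchmark would probably be run several times and the timings averaged, since a single run on a JVM can be distorted by class loading and just-in-time compilation.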

Publication date: 2003